BACKGROUND OF THE INVENTION

1. Field of the Invention 

This invention generally relates to digital image
communication processes and, more particularly, to a system and method
for adaptively controlling the bit rate when transcoding between
compressed video protocols.

2. Description of the Related Art 

Compressed digital video is widely used in multimedia
applications. There exist many digital video coding standards. Different
applications and environments have different video stream requirements.
Therefore, the conversion of digital video bitstreams from one compressed
format into another is necessary. This process is called video
transcoding. The format change may be a different bitrate, frame size, or
even compression standard.

The conventional rate control method uses a one-pass
process. With conventional rate control, the encoder makes assumptions
concerning the picture types, without knowledge of the sequence. It
controls bit allocations and quantization based on the pictures already
coded. This is not optimal for the overall bit allocation.

As noted in US Patent 6,310,915, in the MPEG-2 standard
pictures are both spatially and temporally encoded. Each picture is first
divided into non-overlapping macroblocks, where each macroblock
includes a 16x16 array of luminance samples and each block or array of
8x8 chrominance samples overlaid thereon. A decision is made to encode
the macroblock as an inter macroblock, in which case the macroblock is
both temporally and spatially encoded, or to encode the macroblock as an
intra macroblock, in which case the macroblock is only spatially encoded.
A macroblock is temporally encoded by an inter-picture motion
compensation operation. According to such an operation, a prediction
macroblock is identified for the to-be-motion compensated macroblock and
is subtracted therefrom to produce a prediction error macroblock. The
prediction macroblock originates in another picture, called a reference
picture, or may be an interpolation of multiple prediction macroblocks,
each originating in different reference pictures. The prediction
macroblock need not have precisely the same spatial coordinates (pixel
row and column) as the macroblock from which it is subtracted and in fact
can be spatially offset therefrom. A motion vector is used to identify the
macroblock by its spatial shift and by the reference picture from which it
originates. (When the prediction macroblock is an interpolation of
multiple prediction macroblocks, a motion vector is obtained for each
to-be-interpolated prediction macroblock.)

Pictures may be classified as intra or I pictures, predictive or
P pictures, and bidirectionally predictive or B pictures. An "I" picture
contains only intra macroblocks. A "P" picture may contain inter
macroblocks, but only forward directed predictions from a preceding
reference picture are permitted. A "P" picture can also contain intra
macroblocks for which no adequate prediction was found. In addition, a
dual prime prediction may be formed for a P picture macroblock in an
interlaced picture, which is an interpolated prediction from the
two immediately preceding reference fields. A "B" picture can contain
intra macroblocks, inter macroblocks that are forward motion
compensated, inter macroblocks that are backward motion compensated,
i.e., predicted from a succeeding reference picture, and inter macroblocks
that are bidirectionally motion compensated, i.e., predicted from an
interpolation of prediction macroblocks in each of preceding and
succeeding reference pictures. If the P or B pictures are interlaced, then
each component field macroblock can be separately motion compensated,
or the two fields can be interleaved to form a frame macroblock and the
frame macroblock can be motion compensated at once.

Spatial compression is performed on selected 8x8 luminance
pixel blocks and selected 8x8 pixel chrominance blocks of selected
prediction error macroblocks, or selected intra macroblocks. Spatial
compression includes the steps of discrete cosine transforming each block,
quantizing each block, zig-zag (or alternate) scanning each block into a
sequence, run-level encoding the sequence and variable length encoding
the run-level encoded sequence. Prior to discrete cosine transformation, a
macroblock of a frame picture may optionally be formatted as a frame
macroblock, including blocks containing alternating lines of samples from
each of the two component field pictures of the frame picture, or as a field
macroblock, where the samples from different fields are arranged into
separate blocks of the macroblock. The quantization parameter may be
changed on a macroblock-by-macroblock basis and the weighting matrix
may be changed on a picture-by-picture basis. Macroblocks, or coded
blocks thereof, may be skipped if they have zero (or nearly zero) valued
coded data. Appropriate codes are provided into the formatted bitstream
of the encoded video signal, such as non-contiguous macroblock address
increments, or coded block patterns, to indicate skipped macroblocks and
blocks.

Additional formatting is applied to the variable length
encoded sequence to aid in identifying the following items within the
encoded bitstream: individual sequences of pictures, groups of pictures of
the sequence, pictures of a group of pictures, slices (contiguous sequences
of macroblocks of a single macroblock row) of pictures, macroblocks of
slices, and motion vectors and blocks of macroblocks. Some of the above
layers are optional, such as the group of pictures layer and the slice layer,
and may be omitted from the bitstream if desired. (If slice headers are
included in the bitstream, one slice header is provided for each macroblock
row.) Various parameters and flags are inserted into the formatted
bitstream as well, indicating each of the above noted choices (as well as
others not described above). The following is a brief list of some of such
parameters and flags: picture coding type (I, P, B), macroblock type (i.e.,
forward predicted, backward predicted, bidirectionally predicted, spatially
encoded only), macroblock prediction type (field, frame, dual prime, etc.),
DCT type (i.e., frame or field macroblock format for discrete cosine
transformation), the quantizer scale code, etc.

Generally speaking, it is desirable to use the same picture
coding type and the same intra/inter macroblock decisions in the
subsequent encoding of the transcoding operation as was done in
originally encoding the video signal fed to the transcoder. This maintains
picture quality.

As noted in US Patent 6,587,508, a conventional transcoder
is designed to input first bit streams at a predetermined input bit rate
through the input terminal, to convert the first bit streams into second bit
streams to be output at a predetermined output bit rate, i.e., a target bit
rate, equal to or lower than the input bit rate of the inputted first bit
streams. The conventional transcoder may comprise a variable length
decoder, a de-quantizer, a quantizer, a variable length encoder, and a rate
controller.

The variable length decoder is designed to decode a coded
moving picture sequence signal within the first bit streams to reconstruct
original picture data for each of the pictures, including a matrix of original
quantization coefficients. The de-quantizer is designed to input the
matrix of original quantization coefficients level from the variable length
decoder and the first quantization parameter. The de-quantizer is further
designed to inversely quantize the inputted matrix of original
quantization coefficients level with the first quantization parameter to
generate a matrix of de-quantization coefficients, referred to as "dequant",
i.e., DCT coefficients, for each of the macroblocks as follows:

dequant = {2 x level + sign(level)} x Q1 x QM / 32; (a)

or,

dequant = level x Q1 x QM / 16; (b)

where equation (a) is used for the inter macroblock, while
equation (b) is used for the intra macroblock. QM is a matrix of
quantization parameters stored in a predetermined quantization table.
The first quantization parameter Q1 and the matrix of quantization
parameters QM are derived from the inputted first bit streams by the
decoder. Here, the original quantization coefficients level, the
de-quantization coefficients dequant, the matrix of quantization parameters
QM, and the first quantization parameter Q1 are integers. The
de-quantization coefficients dequant calculated by equations (a) and (b)
should be rounded down to the nearest integer.
The quantizer is designed to input the matrix of
de-quantization coefficients dequant from the de-quantizer and then quantize
the inputted matrix of de-quantization coefficients dequant for each of the
macroblocks with a second quantization parameter, referred to as "Q2"
hereinafter, to generate a matrix of re-quantization coefficients, referred
to as "tlevel", as follows:

tlevel = dequant x 16 / (Q2 x QM); (c)

or,

tlevel = dequant x 16 / (Q2 x QM) + sign(dequant) x 1/2; (d)

where equation (c) is used for the inter macroblock, while
equation (d) is used for the intra macroblock. The second quantization
parameter Q2 is obtained by the rate controller. Here, the re-quantization
coefficients tlevel and the second quantization parameter Q2 are also
integers. The re-quantization coefficients tlevel calculated by
equations (c) and (d) should be rounded down to the nearest integer.
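
The de-quantization and re-quantization formulas (a) through (d) can be traced with integer arithmetic. The following is a minimal sketch only, not the transcoder of US Patent 6,587,508: the function names are hypothetical, QM is treated as a single matrix entry, and "rounded down" is assumed to mean truncation toward zero.

```python
def sign(x: int) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def trunc_div(n: int, d: int) -> int:
    """Divide n by a positive d, truncating toward zero (assumed rounding rule)."""
    return sign(n) * (abs(n) // d)


def dequantize(level: int, q1: int, qm: int, intra: bool) -> int:
    """Equations (a) and (b): recover a DCT coefficient from its quantized level."""
    if intra:
        return trunc_div(level * q1 * qm, 16)                    # equation (b)
    return trunc_div((2 * level + sign(level)) * q1 * qm, 32)    # equation (a)


def requantize(dequant: int, q2: int, qm: int, intra: bool) -> int:
    """Equations (c) and (d): quantize the coefficient again with Q2."""
    if intra:
        # Equation (d), dequant*16/(Q2*QM) + sign(dequant)/2,
        # written over the common denominator 2*Q2*QM.
        return trunc_div(32 * dequant + sign(dequant) * q2 * qm, 2 * q2 * qm)
    return trunc_div(16 * dequant, q2 * qm)                      # equation (c)
```

For example, with level = 5, Q1 = 4, QM = 16, and Q2 = 8, the inter path gives dequant = 22 and tlevel = 2, while the intra path gives dequant = 20 and tlevel = 3.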

The variable length encoder is designed to input the
re-quantization coefficients tlevel from the quantizer and then encode the
inputted matrix of the re-quantization coefficients tlevel to generate
objective picture data for each of the pictures, and to sequentially output the
objective picture data in the form of the second bit streams. The variable
length encoder is also designed to input, from the variable length decoder,
a diversity of information included in the first bit streams that is
necessary for the second bit streams.

The rate controller is designed to perform a rate control over
the encoding in the conventional transcoder according to the TM-5 on the
basis of the information obtained from the de-quantizer as described
below.

The transcoder, however, has no information on the structure
of the group of pictures, such as the picture rate of I or P-pictures within each
group of pictures, so that the transcoder must estimate the structure
of the group of pictures within the inputted moving picture sequence to
allocate bits for each type of picture within the estimated structure of the
group of pictures. Furthermore, the transcoder is required to decode the
first bit streams over almost all of the layers, such as the sequence layer,
the group of pictures layer, the picture layer, the slice layer, and the
macroblock layer, in order to derive the data necessary for transcoding from
the first bit streams. This operation wastes time, thereby causing a
delay in the transcoding process.

An improved conventional transcoder is adapted to perform the rate
control without estimating the structure of the group of pictures. This
transcoder further comprises a delay circuit. The delay circuit is
interposed between the variable length decoder and the de-quantizer and
designed to control the flow of the signal from the variable length decoder
to the de-quantizer. The delay circuit is operated to delay starting the
de-quantizing process in the de-quantizer until the variable length decoder
has finished decoding one of the pictures in the coded moving picture
sequence signal. However, the de-quantizer must wait until the decoding
process of the picture has been completed over the entire target
transcoding frame, thereby causing a delay in the transcoding process.

Another conventional transcoder includes a target output bit
updating unit and a quantization parameter computing unit, in addition
to a target ratio computing unit and a bit difference computing unit. This
transcoder can perform the rate control on the basis of information on
the number of coding bits previously recorded in the input bit streams.
Because this information is already recorded in the bit streams, this
transcoder solves the problem of the
delay in the second conventional transcoder. The third conventional
transcoder, however, has another problem. The encoder that is linked
with the third transcoder must provide the above information on the
number of coding bits to be recorded in the bit streams, thereby causing
a processing delay in the encoder.

In the case of a transcoder, the picture coding type and
inter/intra macroblock decision is preferably constrained to be the same
during a successive encoding as it was during the previous encoding. As
such, the encoder of a transcoder has only two options available for
varying the encoding. First, while the transcoder's decoder decodes
pictures of the bitstream, information regarding the decoded picture types
can be gathered. The transcoder's encoder extrapolates from this
information as to what picture types are expected and allocates bits
accordingly. However, this solution does not work well if the group of
pictures structure of the bitstream changes. For example, the group of
pictures structure can change from IBBPBBPBBPBBI to IIIIIII. In such a
case, the extrapolation of picture coding type will be erroneous. In the
example above, the unanticipated rise in I picture frequency will result in
an incorrect allocation of bits and degraded quality for the unanticipated I
pictures.

Second, the transcoder can make no assumption about
picture types and simply scale the number of bits used in the original
encoding according to the ratio of the bit rate of the originally encoded
bitstream to the bit rate of the re-encoded bitstream produced by the
transcoder. However, this solution does not work well if the bit rate of the
originally encoded bitstream fed to the transcoder is far higher than the
bit rate of the re-encoded bitstream produced by the transcoder. The
reason for this is that the difference in the number of bits used for
different picture coding types is inversely correlated with the bit rate of
the signal. Thus, at very high bit rates, B pictures have a similar number
of bits of encoded data as I pictures, yet at low bit rates, I pictures have far
more bits of encoded data than B pictures.

It would be advantageous if the transcoding process could
take advantage of the known complexity of the input bitstream, as
expressed in the number of bits per frame and the quantization per frame,
to determine the quantization factor of the output bitstream.

SUMMARY OF THE INVENTION 

The present invention introduces a novel method of
picture-level rate control during transcoding. By way of example, MPEG-2 to
MPEG-4 transcoding is demonstrated. The transcoding process begins
with a compressed bitstream, an MPEG-2 bitstream for example. The
information embedded in the input stream is used in the present
invention to achieve a better rate control. During the MPEG-2 decoding
pass, information derived from decoding the input bitstream, such as the
picture type, bits used by each frame, and/or the average quantization
parameter (Qp) of each frame, is gathered. This information is used for
the external rate control of the MPEG-4 encoding pass. A Qp scale factor
is adaptively estimated to scale the Qp from the input video stream. As
used herein, the Qp associated with the input stream is expressed
as Qi, and the Qp associated with the output stream is called Qo.

The scale factor consists of two parts: one is the complexity
ratio between the actual (current) output complexity and the
input complexity; the other is the bitrate correction factor based on the
ratio of the actual bits produced versus the target bits. Both factors are
adaptively adjusted over the encoding process. Separate rate controls are
performed for different picture types (I, P, and B). In this manner, the bit
allocation among different picture types tracks the allocation in the input
MPEG-2 stream. By using this rate control method, the encoder can meet
the bitrate target very closely and achieve an overall better visual quality
than using the internal (conventional) MPEG-4 rate control.

Accordingly, a method is provided for adaptive rate control in
the transcoding of video streams. The method comprises: accepting
frames of an input MPEG encoded video stream; decoding the video
stream; determining video stream complexity; for each frame, calculating
an output video stream quantization parameter (Qo) responsive to
determined video stream complexity; and, encoding the output video
stream into a protocol using Qo.
Some aspects of the method further comprise accepting a
target bit rate ratio (r) for transcoding the video stream that is equal to
the ratio of the target output video stream number of bits per frame (No),
to the input video stream number of bits per frame (Ni) as follows:

r = No/Ni.

Then, Qo is calculated in response to the value of r, as well as
the video stream complexity.

More explicitly, Qo is calculated in response to a complexity
ratio of: an accumulated complexity in the output video stream, to an
accumulated complexity in the input video stream. The accumulated
complexity in the input video stream is the product of Qi times Ni,
accumulated over a plurality of frames. Likewise, the accumulated
complexity of the output video stream is the product of Qo times No,
accumulated over the plurality of frames.

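Therefore, the complexity ratio (α_k) can be expressed as
follows:

\alpha_k = \frac{\sum_{j=0}^{k-1} Q_{o,j} \cdot N_{o,j}}{\sum_{j=0}^{k-1} Q_{i,j} \cdot N_{i,j}};

where j indexes the plurality of frames, so that Q_{o,j}, N_{o,j},
Q_{i,j}, and N_{i,j} are the values of Qo, No, Qi, and Ni for frame j; and,
where k is the current frame.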
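
The accumulation behind the complexity ratio can be illustrated with a short sketch. This is a hedged example rather than the claimed implementation: the function name, the (Q, N) pair layout, and the neutral value of 1.0 returned when no history exists are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def complexity_ratio(
    in_frames: Iterable[Tuple[float, int]],
    out_frames: Iterable[Tuple[float, int]],
) -> float:
    """Ratio of accumulated output complexity to accumulated input complexity.

    Each item is a (Q, N) pair for one already-transcoded frame:
    (Qi, Ni) for the input stream and (Qo, No) for the output stream.
    """
    x_in = sum(q * n for q, n in in_frames)     # sum of Qi * Ni
    x_out = sum(q * n for q, n in out_frames)   # sum of Qo * No
    if x_in == 0:
        return 1.0  # assumed neutral start-up value before any frame is coded
    return x_out / x_in
```

With input frames (Qi, Ni) = (8, 100000), (10, 60000) and output frames (Qo, No) = (16, 40000), (20, 24000), the accumulated complexities are 1,400,000 and 1,120,000, so the ratio is 0.8.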

Other aspects of the method further comprise: determining
an actual bit rate ratio (r') for transcoding the video stream as follows:

r' = No/Ni;

where No and Ni are accumulated over a plurality of frames;
and, determining a feedback correction factor (B_k) responsive to the value
of r' (B_k = r'/r). Then, the calculation of Qo includes modifying the value of
Qo in response to B_k.
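
The feedback correction factor can be sketched as follows. The function name and the start-up value of 1.0 are assumptions for illustration; the source only defines the factor as the actual bit rate ratio r' divided by the target ratio r.

```python
from typing import Sequence


def feedback_correction(bits_out: Sequence[int], bits_in: Sequence[int], r: float) -> float:
    """Return r'/r, where r' is accumulated output bits over accumulated input bits."""
    total_in = sum(bits_in)
    if total_in == 0:
        return 1.0  # assumed neutral value before any frame has been coded
    r_actual = sum(bits_out) / total_in   # r' = No/Ni, accumulated over frames
    return r_actual / r
```

A value above 1 means the output stream has used more bits than targeted, so multiplying Qo by the factor coarsens quantization; a value below 1 has the opposite effect. For input frames of 100000 and 60000 bits, output frames of 44000 and 26400 bits, and r = 0.4, r' is 0.44 and the factor is approximately 1.1.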

Additional details of the above-described method, and a 
system for adaptive rate control in the transcoding of video streams, are 
provided below. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a schematic block diagram of the present invention 
system for adaptive rate control in the transcoding of video streams. 

Fig. 2 is a diagram with steps summarizing one aspect of the 
present invention rate control method. 

Figs. 3 through 5 are drawings comparing the peak
signal-to-noise ratio (PSNR) of I, P, and B picture types, respectively, encoded using
the present invention and CBR methods.

Fig. 6 is a flowchart illustrating the present invention 
method for adaptive rate control in the transcoding of video streams. 

Fig. 7 is a flowchart depicting an alternate aspect of the 
present invention method for adaptive rate control in the transcoding of 
MPEG video streams. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Fig. 1 is a schematic block diagram of the present invention
system for adaptive rate control in the transcoding of video streams. The
system 100 comprises a decoder 102 having an interface on line 104 to
accept frames of an input MPEG encoded video stream and an interface
on line 106 to supply a decoded video stream. The decoder 102 has an
interface on line 108 to supply decoding process information.

A transcoder control unit 110 has an interface on line 108 to
accept the decoding process information. The transcoder control unit 110
determines video stream complexity and supplies an output video stream
quantization parameter (Qo) on line 112 that is responsive to determined
video stream complexity for each frame of the decoded video stream. An
encoder 114 has an interface on line 106 to accept the decoded video and
an interface on line 112 to accept Qo. The encoder 114 has an interface on
line 116 to supply an output video stream encoded into a protocol using
Qo. Typically, the output stream protocol is different than the input
stream protocol.

In one aspect of the system 100, the transcoder control unit
110 has an interface on line 118 to accept a target bit rate ratio (r) for
transcoding the video stream. The target bit rate ratio r is equal to the
ratio of the target output video stream number of bits per frame (No), to
the input video stream number of bits per frame (Ni) as follows:

r = No/Ni.

The transcoder control unit 110 calculates Qo responsive to
the value of r, as well as in response to the video stream complexity. In
one aspect of the invention, the decoder 102 accepts an MPEG-2 input
video stream and the encoder 114 encodes the output video stream into an
MPEG-4 protocol.

The decoder 102 supplies decoder processing information
that includes an average input video stream quantization factor (Qi) for
each frame. Initially, the transcoder control unit 110 calculates Qo as
follows:

Qo = Qi/r.

More explicitly, the decoder 102 accepts frames, on line 104,
of an input MPEG encoded video stream with a plurality of slices. The
decoder 102 calculates Qi for each frame by averaging the Qi values for
each slice in a frame. As explained in more detail below, the calculation of
Qo is refined with the collection of more information.
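
The per-frame averaging and the initial estimate Qo = Qi/r can be written out in a few lines. The function names below are hypothetical, and the sketch assumes the slice quantization values are available as a plain list of numbers.

```python
from typing import Sequence


def frame_qi(slice_qps: Sequence[float]) -> float:
    """Average the quantization parameters of all slices in one frame."""
    if not slice_qps:
        raise ValueError("a frame must contain at least one slice")
    return sum(slice_qps) / len(slice_qps)


def initial_qo(qi: float, r: float) -> float:
    """Initial output quantization parameter, Qo = Qi / r."""
    if r <= 0:
        raise ValueError("the target bit rate ratio r must be positive")
    return qi / r
```

For a frame whose slices carry quantization values 6, 8, and 10, Qi is 8; with a target ratio r = 0.5, the initial Qo is 16.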

Typically, the decoder 102 accepts an input MPEG encoded
video stream with intra (I), predictive (P), and bi-directionally predictive
(B) picture types. The transcoder control unit 110 independently
determines the complexities of the I, P, and B picture types in the input
video stream. Alternately stated, the transcoder control unit 110
calculates a video stream complexity for each picture type. Likewise, the
transcoder control unit 110 independently determines the complexities of
the I, P, and B picture types in the output video stream.

More specifically, the transcoder control unit 110 calculates
Qo in response to a complexity ratio of: an accumulated complexity in the
output video stream; to an accumulated complexity in the input video
stream. The accumulated complexity in the input video stream is the
product of Qi times Ni, accumulated over a plurality of frames. Likewise,
the accumulated complexity of the output video stream is the product of
Qo times No, accumulated over the plurality of frames.

Then, the transcoder control unit 110 calculates the
complexity ratio (α_k) as follows:

\alpha_k = \frac{\sum_{j=0}^{k-1} Q_{o,j} \cdot N_{o,j}}{\sum_{j=0}^{k-1} Q_{i,j} \cdot N_{i,j}};

where j indexes the plurality of frames; and,
where k is the current frame.

Thus, the transcoder control unit 110 calculates Qo, for each
frame, as follows:

Qo = (α_k · Qi)/r.

In some aspects of the system 100, the transcoder control
unit 110 determines an actual bit rate ratio (r') for transcoding the video
stream as follows:

r' = No/Ni;

where No and Ni are accumulated over a plurality of frames.

Then, the transcoder control unit 110 determines a feedback
correction factor (B_k) responsive to the value of r', and modifies the value
of Qo in response to B_k. More specifically, the transcoder control unit 110
determines B_k, for each frame, as follows:

B_k = r'/r.

Thus, the transcoder control unit 110 calculates Qo, for each
frame, as follows:

Qo = ((α_k · Qi)/r) · B_k;

where the value of α_k (and B_k) is updated after every frame.



Functional Description

The present invention method for picture-level rate control
is described below in the context of transcoding from MPEG-2 to an
MPEG-4 video stream with a lower bit rate. However, it should be
understood that the present invention method can be used for transcoding
between other standards.

Ideally, the transcoding result should be of the same visual
quality as the MPEG-2 source. A straightforward way to do this is to fully
decode the MPEG-2 stream and re-encode it using an MPEG-4 encoder.
However, this approach ignores the fact that there is valuable information
in the MPEG-2 stream that can be used for rate control. Since the present
invention targets the problem of rate control, the picture type, average
quantization parameter (Qp), and bit usage of each frame from the
MPEG-2 stream can be used to control the MPEG-4 encoding.

A modified cascade decoder and encoder, with an external
rate control, are shown in Fig. 1. First, the MPEG-2 stream is decoded to
extract needed information, referred to herein as decoding process
information. The average Qp of a frame is computed by averaging the Qp
of all the slices in a frame. In the MPEG-4 encoding pass, the encoder
uses the same picture type as the MPEG-2 frame, and it uses an external
rate control instead of its internal one. A Qp scale factor is generated for
each frame, to scale the input Qp (Qi) for use in encoding an MPEG-4
frame. The present invention derives this scale factor.

The ratio between the target MPEG-4 bit rate and the
MPEG-2 bit rate is r (r < 1). The goal of the rate control is to achieve:

\sum_k N_{o,k} = r \cdot \sum_k N_{i,k}    (1)

where N denotes the number of bits actually used in one
frame, subscript "o" represents the output MPEG-4, subscript "i"
represents the input MPEG-2, and the subscript "k" represents the frame
index. The sum is taken over all the frames. The target number of bits of
one MPEG-4 frame is set to be:

N_{o,k} = r \cdot N_{i,k}    (2)

This not only achieves the goal expressed in Equation (1),
but also tracks the relative bit allocations of the MPEG-2 source. It is
known that the number of bits used in a frame is loosely inversely proportional
to its quantization parameter (Qp). So for each frame k, there must exist
a constant α_k so that:

\frac{N_{o,k}}{N_{i,k}} = \alpha_k \cdot \frac{Q_{i,k}}{Q_{o,k}}    (3)

where Q denotes Qp. Equation (3) can be expressed as:

\alpha_k = \frac{Q_{o,k} \cdot N_{o,k}}{Q_{i,k} \cdot N_{i,k}}    (4)

According to MPEG-2 Test Model 5 (TM5; ISO/IEC
JTC1/SC29/WG11, MPEG-2 Test Model 5, April 1993a), the picture
complexity measure is defined as the product of the bits generated and the
average quantization parameter. In TM5, separate complexity measures
are defined for I, P, and B picture types. So α_k is, in effect, the complexity
ratio between the corresponding MPEG-4 and MPEG-2 frames.
Complexity is a relative measure that describes the difficulty of coding a
frame, as compared to other frames in the same sequence. The complexity
values of the same frame in the input MPEG-2 and the transcoded MPEG-4
stream are likely to be different, because they are coded using different
standards, encoders, and bit rates. Complexity changes are also
dependent on picture types and picture content. However, since the
complexity changes for every frame result from the same cause (change of
standard, encoders, and bit rate), it is reasonable to assume that the
complexity change ratio for the same picture type is relatively constant, at
least over a short period of time.

Therefore, the present invention estimates α_k based on the
accumulated complexities of the previous frames of the same type. Three
complexity ratios are estimated separately for I, P, and B pictures. The
accumulation may be done from the sequence start or from a GOP start.

\alpha_k = \frac{\sum_{j=0}^{k-1} Q_{o,j} \cdot N_{o,j}}{\sum_{j=0}^{k-1} Q_{i,j} \cdot N_{i,j}}    (5)

Knowing α_k, from Equation (3), the quantization parameter
Qo of frame "k" can be initialized to be,

Q_{o,k} = \frac{\alpha_k \cdot Q_{i,k}}{r}    (6)



For any estimation, a feedback correction factor is helpful.
Initializing Q_{o,k} as in (6) can make the bits produced (N_{o,k}) close to the
target r · N_{i,k}, but the count won't be exactly equal. To achieve the target
bit rate needed to meet Equation (1), another rate adjustment factor (B_k)
is introduced. It is defined as the ratio of the actually used bits versus the
target bits, that is, the actual bit rate ratio (r') versus the target bit rate
ratio r. Again, factors for I, P, and B pictures are estimated separately.

B_k = \frac{\sum_{j=0}^{k-1} N_{o,j}}{r \cdot \sum_{j=0}^{k-1} N_{i,j}} = \frac{r'}{r}    (7)

With this correction, combining Equation (5) and (7),
Q_{o,k} becomes:

Q_{o,k} = \frac{\alpha_k \cdot Q_{i,k}}{r} \cdot B_k    (8)



Rate control for I, P, and B frames is performed separately.
In this manner, the bit allocation ratio among I, P, and B pictures remains the
same as in the MPEG-2 stream. The independent bit allocation ratio
prevents a poor B picture, for example, from consuming too many bits
and adversely affecting I and P picture quality by leaving too few bits
available for them. Adjustment to the bit allocation among these pictures
may be performed, for example, using the method in MPEG-2 Test Model
5.
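
The separate rate control for I, P, and B pictures can be sketched as one small controller that keeps independent accumulators per picture type. This is an illustrative sketch under stated assumptions, not the claimed implementation: the class and method names are hypothetical, the start-up value of 1.0 for both factors is assumed, and the rounding and clamping of Qo to the range 1 to 31 is an added assumption about the encoder's quantizer range.

```python
from dataclasses import dataclass


@dataclass
class TypeState:
    """Accumulators for one picture type (I, P, or B)."""
    x_in: float = 0.0    # accumulated input complexity, sum of Qi * Ni
    x_out: float = 0.0   # accumulated output complexity, sum of Qo * No
    bits_in: int = 0     # accumulated input bits, sum of Ni
    bits_out: int = 0    # accumulated output bits, sum of No


class PictureRateControl:
    def __init__(self, r: float, qp_min: int = 1, qp_max: int = 31):
        self.r = r                                   # target bit rate ratio
        self.qp_min, self.qp_max = qp_min, qp_max    # assumed encoder Qp range
        self.state = {t: TypeState() for t in "IPB"}

    def next_qo(self, ptype: str, qi: float) -> int:
        """Qo = (alpha_k * Qi / r) * B_k, using only frames of the same type."""
        s = self.state[ptype]
        alpha = s.x_out / s.x_in if s.x_in > 0 else 1.0
        b_k = (s.bits_out / s.bits_in) / self.r if s.bits_in > 0 else 1.0
        qo = alpha * qi / self.r * b_k
        return min(self.qp_max, max(self.qp_min, round(qo)))

    def update(self, ptype: str, qi: float, ni: int, qo: float, no: int) -> None:
        """Record the outcome of one transcoded frame."""
        s = self.state[ptype]
        s.x_in += qi * ni
        s.x_out += qo * no
        s.bits_in += ni
        s.bits_out += no
```

Because each picture type has its own state, an overshoot on one type changes Qo only for later frames of that type. Resetting the accumulators at a GOP start, which the description mentions as an option, would amount to replacing the affected TypeState objects.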

Fig. 2 is a diagram with steps summarizing one aspect of the
present invention rate control method. Using the present invention rate
control method, the transcoding meets the target rate. Depending on the
ratio of the bit rate reduction, the quality degrades smoothly along a
whole sequence of frames, as compared to the MPEG-2 source. Compared
to MPEG-4 encoding using an internal rate control (CBR-based rate
control), the overall visual quality is better using the new rate control
method.

Figs. 3 through 5 are drawings comparing the peak
signal-to-noise ratio (PSNR) of I, P, and B picture types, respectively, encoded using
the present invention and CBR methods. In this experiment, the MPEG-2
source was a high quality "Star War" movie trailer. It had 900 frames,
and an average bit rate of 4.7 Mb/s. This sequence included a lot of
motion and many scene changes. It was transcoded to an MPEG-4 stream
targeted at 80% of the original bit rate. With the constant bit rate (CBR)
method, the encoder had no knowledge of the MPEG-2 sequence, and the
"I" picture interval was set to be 15 frames. From the figures, it can be
seen that the present invention method is better than the CBR method in
most cases. This is especially evident for I and P pictures. The good quality
is preserved along the whole sequence. The average PSNRs of coding I, P,
and B pictures using both rate control methods are listed in Table 1.





                  present invention        CBR rate control
                  rate control (PSNRs)     (PSNRs)

I Pictures        46.3598                  42.260
P Pictures        41.945                   40.253
B Pictures        40.9600                  39.882

Table 1. Average PSNR using present invention and CBR
rate control methods


Fig. 6 is a flowchart illustrating the present invention 
method for adaptive rate control in the transcoding of video streams. 
Although the method is depicted as a sequence of numbered steps for 
clarity, no order should be inferred from the numbering unless explicitly 
stated. It should be understood that some of these steps may be skipped, 
performed in parallel, or performed without the requirement of 
maintaining a strict order of sequence. The method starts at Step 600. 

Step 602 accepts frames of an input MPEG encoded video 
stream. Step 604 decodes the video stream. Step 606 determines video 
stream complexity. Step 608, for each frame, calculates an output video 
stream quantization parameter (Qo) responsive to determined video 
stream complexity. Step 610 encodes the output video stream into a 
protocol using Qo. 
In some aspects of the method, Step 607a accepts a target bit
rate ratio (r) for transcoding the video stream that is equal to the ratio of
the target output video stream number of bits per frame (No), to the input
video stream number of bits per frame (Ni), as follows:

r = No/Ni.

Then, calculating Qo responsive to the determined video
stream complexity in Step 608 includes calculating Qo in response to the
value of r. In one aspect, Step 610 encodes the output video stream into
an MPEG-4 video stream using r.

In some aspects, determining the video stream complexity of
the input video stream in Step 606 includes calculating an average input
video stream quantization factor (Qi) for each frame. Then, calculating Qo
responsive to the determined video stream complexity in Step 608
includes initially calculating Qo as follows:

Qo = Qi/r.
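
The two relations above, r = No/Ni and Qo = Qi/r, can be sketched as follows. This is an illustration only: the function names are hypothetical, and rounding and clamping the result to the range 1 to 31 is an assumption about the output quantizer range, not something the description specifies.

```python
def target_ratio(no_bits_per_frame, ni_bits_per_frame):
    """Target bit rate ratio r = No/Ni."""
    if ni_bits_per_frame <= 0:
        raise ValueError("Ni must be positive")
    return no_bits_per_frame / ni_bits_per_frame


def initial_qo(qi, r, q_min=1, q_max=31):
    """Initial output quantization parameter Qo = Qi/r.

    Rounding and clamping to [q_min, q_max] are assumptions made
    for this sketch; the description only gives Qo = Qi/r.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return max(q_min, min(q_max, round(qi / r)))
```

For the experiment described above (output targeted at 80% of the input bit rate), r is 0.8, so an input frame with an average Qi of 8 would start from a Qo of 10.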

In some aspects, accepting frames of an input MPEG encoded
video stream in Step 602 includes accepting frames with a plurality of
slices. Calculating Qi for each frame in Step 606 includes calculating the
quantization parameter by averaging the Qi values for each slice in a
frame. In other aspects, accepting an input MPEG encoded video stream
in Step 602 includes accepting intra (I), predictive (P), and bi-directionally
predictive (B) picture types. Then, determining the video stream
complexity of the input MPEG encoded video stream in Step 606 includes
substeps. Step 606a independently determines the complexities of the I,
P, and B picture types in the input video stream. Step 606b
independently determines the complexities of the I, P, and B picture types
in the output video stream.
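
The slice averaging and the separate handling of picture types described above might be sketched as follows. The data layout, a list of (picture type, slice quantizer values) pairs, and the function names are assumptions made for illustration.

```python
from statistics import mean


def frame_qi(slice_quantizers):
    """Average the per-slice quantizer values of one frame to obtain Qi."""
    if not slice_quantizers:
        raise ValueError("a frame needs at least one slice")
    return mean(slice_quantizers)


def qi_by_picture_type(frames):
    """Collect per-frame Qi values separately for I, P, and B pictures.

    frames: iterable of (picture_type, slice_quantizers) pairs
    (a hypothetical representation of the decoded input stream).
    """
    per_type = {"I": [], "P": [], "B": []}
    for ptype, slice_quantizers in frames:
        per_type[ptype].append(frame_qi(slice_quantizers))
    return per_type
```

Keeping one list per picture type reflects Steps 606a and 606b, in which the I, P, and B complexities are determined independently.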

In one aspect, determining the video stream complexity in
Step 606 includes determining a complexity ratio: of an accumulated
complexity in the output video stream, to an accumulated complexity in
the input video stream. The accumulated complexity in the input video
stream is the product of Qi times Ni, accumulated over a plurality of
frames. The accumulated complexity of the output video stream is the
product of Qo times No, accumulated over the plurality of frames. Thus,
the complexity ratio (α_k) can be expressed as follows:

$$\alpha_k = \frac{\sum_{j} (\mathrm{Qo}_j \cdot \mathrm{No}_j)}{\sum_{j} (\mathrm{Qi}_j \cdot \mathrm{Ni}_j)};$$

where j ranges over the plurality of frames; and,
where k is the current frame.
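
A running form of the complexity ratio can be sketched as below. The class name is hypothetical, and returning 1.0 before any frame has been accumulated is an assumption chosen so that the first frame falls back to the initial relation Qo = Qi/r.

```python
class ComplexityRatio:
    """Running ratio of accumulated output to input complexity."""

    def __init__(self):
        self.in_sum = 0.0   # accumulated Qi * Ni over the input frames
        self.out_sum = 0.0  # accumulated Qo * No over the output frames

    def update(self, qi, ni, qo, no):
        """Add one transcoded frame to both accumulators."""
        self.in_sum += qi * ni
        self.out_sum += qo * no

    @property
    def alpha(self):
        """Current complexity ratio.

        Assumption: 1.0 is returned until at least one frame has
        been accumulated.
        """
        if self.in_sum == 0:
            return 1.0
        return self.out_sum / self.in_sum


# One accumulator per picture type, since the I, P, and B
# complexities are determined independently.
ratios = {ptype: ComplexityRatio() for ptype in ("I", "P", "B")}
```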

Thus, calculating Qo in Step 608 includes calculating Qo,
for each frame, as follows:

$$\mathrm{Qo} = (\alpha_k \cdot \mathrm{Qi}) / r.$$

In some aspects, a further step, Step 607b1, determines an
actual bit rate ratio (r') for transcoding the video stream as follows:

r' = No/Ni;

where No and Ni are accumulated over a plurality of frames.
Step 607b2 determines a feedback correction factor (B_k) responsive to the
value of r'. Then, calculating Qo in Step 608 includes modifying the value
of Qo in response to B_k.

In one aspect, determining B_k in Step 607b2 includes
determining B_k, for each frame, as follows:

B_k = r'/r.
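
The actual bit rate ratio and the feedback correction factor defined above can be sketched as follows. The function names and the use of plain lists of per-frame bit counts are illustrative assumptions.

```python
def actual_ratio(no_history, ni_history):
    """Actual bit rate ratio r' = No/Ni, with both accumulated over frames.

    no_history / ni_history: per-frame output and input bit counts
    for the frames transcoded so far.
    """
    total_ni = sum(ni_history)
    if total_ni <= 0:
        raise ValueError("accumulated Ni must be positive")
    return sum(no_history) / total_ni


def feedback_factor(r_actual, r_target):
    """Feedback correction factor B_k = r'/r."""
    if r_target <= 0:
        raise ValueError("r must be positive")
    return r_actual / r_target
```

By construction, a factor greater than 1 indicates that the output stream has so far used more bits than the target ratio allows, and a factor below 1 indicates that it has used fewer.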

Then, calculating Qo in Step 608 includes calculating Qo, for
each frame, as follows:

$$\mathrm{Qo} = (\alpha_k \cdot \mathrm{Qi}) / r \cdot B_k;$$

where the value of α_k (as well as B_k) is updated after every
frame.

Fig. 7 is a flowchart depicting an alternate aspect of the
present invention method for adaptive rate control in the transcoding of
MPEG video streams. The method starts at Step 700. Step 702 accepts
frames of an input MPEG-2 encoded video stream. Step 704 decodes the
video stream. Step 706 determines a video stream complexity ratio: of an
accumulated complexity in the output video stream, to an accumulated
complexity in the input video stream. Step 708, for each frame, calculates
an output video stream quantization parameter (Qo) in response to the
complexity ratio. Step 710 encodes the output video stream into an
MPEG-4 protocol using Qo.
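
The per-frame loop of Fig. 7 might look like the following sketch. It is not the claimed implementation: the `encode` callback stands in for a real MPEG-4 encoder, the rounding and the 1 to 31 clamp on Qo are assumptions, and the complexity ratio is taken as 1.0 until the first frame has been coded.

```python
def transcode_rate_control(frames, r, encode, q_min=1, q_max=31):
    """Choose Qo for each frame from the running complexity ratio.

    frames: iterable of (qi, ni) pairs taken from the decoded input
            stream (average input quantizer and input bits per frame).
    r:      target bit rate ratio No/Ni.
    encode: hypothetical callback encode(index, qo) -> output bits (No).
    Returns the list of Qo values used.
    """
    in_sum = 0.0   # accumulated Qi * Ni
    out_sum = 0.0  # accumulated Qo * No
    qos = []
    for k, (qi, ni) in enumerate(frames):
        # Complexity ratio from the frames coded so far (assumed 1.0 at start).
        alpha = out_sum / in_sum if in_sum else 1.0
        # Qo = (alpha * Qi) / r, rounded and clamped (assumption).
        qo = max(q_min, min(q_max, round(alpha * qi / r)))
        no = encode(k, qo)
        in_sum += qi * ni
        out_sum += qo * no
        qos.append(qo)
    return qos


def toy_encoder(index, qo):
    """Toy stand-in for an encoder: bits inversely proportional to Qo."""
    return 900000 / qo


qos = transcode_rate_control([(8, 100000)] * 3, r=0.8, encode=toy_encoder)
```

With this toy model the first frame uses Qo = 10 and produces more bits than the 80,000 targeted, so the accumulated ratio rises above 1 and the following frames use Qo = 11, which brings the output closer to the target.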

A system and method have been provided for adaptive rate
control in the transcoding of compressed video streams. An example of
transcoding from the MPEG-2 format to the MPEG-4 format has been given, but
the invention is not limited to merely this example. Specific
descriptions of exemplary complexity determinations have also been
provided. However, it should be understood that the invention is not
limited to one particular formula. Other variations and embodiments of
the invention will occur to those skilled in the art.